The Metadata Troll Detector

Authors

  • Stephan Dollberg
  • Tobias Langner
  • Jochen Seidel

Abstract

Reddit.com offers a discussion platform for various topics. Given its huge size and its comment-based discussion structure, it attracts internet trolls. We write a bot that crawls Reddit for new comments and tries to automatically detect trolling in them. In contrast to conventional solutions, we use not only text analysis of the comments but also metadata, which describes properties such as the degree of participation in a discussion. For classification, we compile several characteristics of each comment based on data and metadata. These characteristics are used to train support vector machines, a machine learning algorithm, to classify each comment. We show that it is generally very difficult to detect trolls automatically and find that metadata-based approaches are still inferior to text-based ones. However, the combination of both shows promising results.
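
The pipeline outlined in the abstract (compile text and metadata characteristics per comment, then train a support vector machine) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Comment` fields, the toy `LEXICON`, the two features, the four toy comments, and the plain subgradient-descent trainer used in place of whatever SVM implementation the authors used.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon of provocative words; not taken from the paper.
LEXICON = {"idiot", "stupid", "lol"}


@dataclass
class Comment:
    text: str
    author_comments_in_thread: int  # hypothetical metadata field
    thread_size: int                # hypothetical metadata field


def text_features(c):
    """Text-based feature: number of lexicon words in the comment."""
    words = [w.strip(".,!?").lower() for w in c.text.split()]
    return [float(sum(w in LEXICON for w in words))]


def metadata_features(c):
    """Metadata-based feature: the author's share of the thread's comments."""
    return [c.author_comments_in_thread / max(c.thread_size, 1)]


def features(c):
    """Combined feature vector: text features followed by metadata features."""
    return text_features(c) + metadata_features(c)


def train_linear_svm(xs, ys, lam=0.01, eta=0.1, epochs=200):
    """Subgradient descent on the L2-regularized hinge loss; labels are +1 / -1."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1.0 - eta * lam) for wi in w]  # regularization step
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b


def predict(w, b, x):
    """Return +1 (troll) or -1 (not a troll) for a feature vector."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy, hand-written comments for illustration only (not real Reddit data).
comments = [
    Comment("lol you idiot", 6, 10),
    Comment("that is a stupid take", 5, 10),
    Comment("thanks for the link", 1, 10),
    Comment("I agree with this", 2, 10),
]
labels = [1, 1, -1, -1]

w, b = train_linear_svm([features(c) for c in comments], labels)
predictions = [predict(w, b, features(c)) for c in comments]
```

Keeping `text_features` and `metadata_features` as separate functions makes it straightforward to train on either group alone or on their concatenation, which mirrors the comparison the abstract describes.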

Similar resources

Finding Public Opinion Manipulation Trolls in Bulgarian Online News Media

With the rise of social media, it became normal for people to read and follow other users' opinions. This created the opportunity for corporations, governments and others to distribute rumors, misinformation, and speculation and to use other dishonest practices to manipulate public opinion (Derczynski and Bontcheva, 2014). They could consistently use trolls (Cambria, Chandra and Sharma, 2010),...

A readability level prediction tool for K-12 books

The readability level of a book is a useful measure for children and teenagers (teachers, parents, and librarians, respectively) to identify reading materials suitable for themselves (their K-12 readers, respectively). Unfortunately, the majority of published books are assigned a readability level range, such as K-3, instead of a single readability level for their intended readers, by professionals...

Metadata Enrichment for Automatic Data Entry Based on Relational Data Models

The automatic generation of data entry forms from relational data models is a well-known idea that has been discussed increasingly often, owing to the popularity of agile methods in software development and the accompanying development of programming tools. One of the requirements of the automation methods, whether in commercial products or the relevant research projects, ...

The TROLL Approach to Conceptual Modeling: Syntax, Semantics and Tools

In this paper, we present the use of Troll for the conceptual modelling of distributed information systems. Troll offers both textual and graphical notations. Troll has been used in practice to model an industrial information system. We use an extract of this case study to describe briefly the syntax and underlying semantics of the language. We also show a set of software tools that are being d...

Investigating the Response of Web Search Engines to Metadata Records Based on a Combined Microdata and Linked Data Method

The purpose of this research was to find out the reaction of Web search engines to metadata records created based on the combined method of Rich Snippets and Linked Data. 200 metadata records in two groups (100 records as the control group with the normal structure and 100 records created based on microdata and implemented in RDF/XML as the experimental group) extracted from the information gatewa...

Publication date: 2015